Sparse Interpretable Audio Model

Model Architecture

This small model attempts to decompose audio featuring acoustic instruments into the following components:

While global context and local event data are encoded as real-valued vectors rather than discrete values, the learned representation still lends itself to a sparse, interpretable, and hopefully easy-to-manipulate encoding. This first draft was trained on the amazing MusicNet dataset.
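One way to picture this kind of encoding is as a single global context vector plus a short list of timed events, each carrying its own real-valued vector. The sketch below is an illustration only: the names `Event` and `SparseEncoding`, the vector sizes, and the frame-based time unit are all assumptions made for the example, not the model's actual code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    """A single local event (hypothetical container, not from the model's code)."""
    time: int             # onset position in frames (assumed unit)
    vector: List[float]   # real-valued description of the event


@dataclass
class SparseEncoding:
    """Global context plus a handful of timed events (illustrative only)."""
    context: List[float]  # one real-valued vector for the whole segment
    events: List[Event]
    n_frames: int         # length of the encoded segment in frames

    def timeline(self) -> List[int]:
        """Per-frame count of events; mostly zeros when events are sparse."""
        counts = [0] * self.n_frames
        for event in self.events:
            counts[event.time] += 1
        return counts

    def density(self) -> float:
        """Fraction of frames that contain at least one event."""
        occupied = sum(1 for c in self.timeline() if c > 0)
        return occupied / self.n_frames


# Toy values chosen for the example; they do not come from a trained model.
encoding = SparseEncoding(
    context=[0.1, -0.3, 0.7],
    events=[Event(time=2, vector=[0.5, 0.2]), Event(time=9, vector=[-0.1, 0.9])],
    n_frames=16,
)
print(encoding.timeline())
print(encoding.density())
```

Because most frames hold no event, editing the sound reduces to changing a few vectors or moving a few onsets, which is what makes such an encoding easy to manipulate.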

Each sound sample below includes the following elements:

  1. The original recording
  2. The model's reconstruction
  3. New audio using the original timing and context vector, but random event vectors
  4. New audio using the original event and context vectors, but with random timings
  5. New audio using the original timing and event vectors, but with a random global context vector
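The three variations in items 3 to 5 can be sketched as simple resampling operations on an encoding. This is a rough illustration under stated assumptions: the function names are hypothetical, the per-dimension Gaussian follows the page's note that random event vectors are based on the mean and variance of the sample's event vectors, and the uniform timing distribution and standard-normal context are guesses rather than the author's documented choices. Decoding the result back to audio is not shown.

```python
import random
import statistics
from typing import List, Sequence


def random_event_vectors(events: Sequence[Sequence[float]],
                         rng: random.Random) -> List[List[float]]:
    """Draw new event vectors from a per-dimension Gaussian fitted to this sample."""
    dims = list(zip(*events))  # transpose: one tuple of values per dimension
    means = [statistics.fmean(d) for d in dims]
    stds = [statistics.pstdev(d) for d in dims]
    return [[rng.gauss(m, s) for m, s in zip(means, stds)] for _ in events]


def random_timings(times: Sequence[int], n_frames: int,
                   rng: random.Random) -> List[int]:
    """Keep the number of events, but place each at a uniformly random frame (assumed)."""
    return [rng.randrange(n_frames) for _ in times]


def random_context(context: Sequence[float], rng: random.Random) -> List[float]:
    """Swap in a standard-normal vector of the same size (assumed distribution)."""
    return [rng.gauss(0.0, 1.0) for _ in context]


# Toy encoding: three events in a 16-frame segment (values are made up).
rng = random.Random(0)
times = [2, 9, 13]
events = [[0.5, 0.2], [-0.1, 0.9], [0.3, 0.4]]
context = [0.1, -0.3, 0.7]

# Each variant changes exactly one component and keeps the other two.
variant_events = (times, random_event_vectors(events, rng), context)
variant_timings = (random_timings(times, 16, rng), events, context)
variant_context = (times, events, random_context(context, rng))
```

Changing one component at a time is what lets a listener hear what each part of the encoding contributes.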

Future Directions

There are several areas that could provide further gains in compression and interpretability:

Cite this Work

                    
@misc{vinyard2023audio,
    author = {Vinyard, John},
    title = {Sparse Interpretable Audio},
    url = {https://JohnVinyard.github.io/machine-learning/2023/11/15/sparse-physical-model.html},
    year = 2024
}
                    
                

Sound Samples

Original

Reconstruction

With Random Event Vectors

(based on mean and variance of event vectors for this sample)

With Random Timings

With Random Global Context Vector

Global Context Vector for Original

Individual events can be played by clicking on vectors.

Timeline
